Rank | Count | Beginning |
---|---|---|
9138 | 2699 | Дар |
1243 | 1027 | Аз |
13693 | 975 | Ин |
3760 | 931 | Ба |
2782 | 570 | Аммо |
6275 | 391 | Бо |
7214 | 372 | Ва |
4639 | 369 | Барои |
833 | 356 | Агар |
7787 | 333 | Вале |
24520 | 279 | Ӯ |
17838 | 261 | Мо |
16646 | 259 | Ман |
12842 | 230 | Зеро |
20153 | 173 | Он |
24110 | 171 | То |
6944 | 170 | Бояд |
22691 | 170 | Соли |
20823 | 142 | Пас |
20323 | 139 | Онҳо |
29570 | 134 | Як |
13522 | 125 | Имрӯз |
23960 | 125 | Тибқи |
26941 | 120 | Ҳоло |
7494 | 117 | Вай |
26323 | 117 | Ҳар |
5625 | 112 | Баъди |
25780 | 109 | Ҳамин |
28378 | 108 | Чун |
26005 | 99 | Ҳамчунин |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
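The same count can be sketched outside the database. The following Python snippet is a minimal illustration, not the tooling used to produce the table above: the function name `sentence_beginnings` and the sample sentences are hypothetical, and it assumes the sentences are already available as a list of strings. It mirrors `substring_index(sentence, ' ', N)` by keeping everything before the N-th space, so it also covers the cases N = 2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of n words.

    Hypothetical helper: mirrors substring_index(sentence, ' ', n) by
    keeping everything before the n-th space character.
    """
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces only, as the SQL delimiter ' ' does.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Most frequent first, like "order by cnt desc limit 50".
    return counts.most_common(limit)


# Toy input, invented for illustration; not taken from the corpus.
sample = [
    "In the city it rained.",
    "In the end we left.",
    "We left early.",
    "In winter it snows.",
]

top_one_word = sentence_beginnings(sample, n=1)
top_two_words = sentence_beginnings(sample, n=2)
```

With the toy input, the one-word beginning "In" is counted three times and the two-word beginning "In the" twice. As in the SQL query, the order among beginnings with equal counts should not be relied on.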